Kallipyga massiliensis strain ph2T is the type strain of Kallipyga massiliensis gen. nov., sp. nov., the type species of the new genus Kallipyga within the family Clostridiales Incertae Sedis XI. This strain, whose genome is described here, was isolated from the fecal flora of a 26-year-old woman suffering from morbid obesity. K. massiliensis is an obligate anaerobic coccus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1,770,679 bp long genome (1 chromosome but no plasmid) contains 1,575 protein-coding and 50 RNA genes, including 4 rRNA genes.



Introduction 

Kallipyga massiliensis strain ph2T (CSUR=P241, DSM=26229) is the type strain of K. massiliensis gen. nov., sp. nov. This bacterium was isolated from the stool sample of an obese French patient as part of a study aiming at individually cultivating all species occurring within human feces [1-3]. It is a Gram-positive, anaerobic, indole-negative coccus. Defining the taxonomic status of bacterial isolates remains a challenging task. The taxonomic molecular tools currently available, including 16S rRNA sequence similarity, G+C content and DNA-DNA hybridization (DDH) [4,5], although considered gold standards, have limitations [6,7]. The 16S rRNA sequence similarity and G+C content thresholds do not apply uniformly to all species or genera, and the DDH method lacks intra- and inter-laboratory reproducibility [5]. The advent of high-throughput genome sequencing and proteomic analysis [8] has granted unprecedented access to exhaustive genetic and protein information for bacterial isolates. We recently proposed a polyphasic approach to describe new bacterial species in which genome sequences and MALDI-TOF spectra are used along with phenotypic characteristics [9-30].

The family Clostridiales Incertae Sedis XI (Garrity and Holt 2001) was created in 2001 [31] and currently includes the following 11 genera: Anaerococcus (Ezaki et al. 2001) [32], Dethiosulfatibacter (Takii et al. 2007) [33], Finegoldia (Murdoch and Shah 2000) [34], Gallicola (Ezaki et al. 2001) [32], Helcococcus (Collins et al. 1993) [35], Parvimonas (Tindall and Euzeby 2006) [36], Peptoniphilus (Ezaki et al. 2001) [32], Sedimentibacter (Breitenstein et al. 2002) [37], Soehngenia (Parshina et al. 2003) [38], Sporanaerobacter (Hernandez-Eugenio et al. 2002) [39] and Tissierella (Collins and Shah 1986) [40]. Currently, 31 species with validly published names are reported in this family [41]. The species listed in the Clostridiales Incertae Sedis XI are mostly Gram-positive, obligate anaerobic cocci. Members of this family have been identified as pathogens in both humans and animals. In humans, they have often been isolated from patients with septic arthritis, necrotizing pneumonia, prosthetic joint infection and other clinical conditions associated with vaginal discharges and ovarian, peritoneal and sacral abscesses [42-46].

Here we present a summary classification and a set of features for K. massiliensis gen. nov., sp. nov., strain ph2T (CSUR=P241, DSM=26229), together with the description of the complete genomic sequencing and annotation. These characteristics support the circumscription of the genus Kallipyga and its type species, K. massiliensis, within the Clostridiales Incertae Sedis XI family.
Classification and features

A stool sample was collected from a 26-year-old woman living in Marseille (France). She suffered from morbid obesity and had a body mass index of 48.2 (118.8 kg, 1.57 m). At the time of stool sample collection she was not taking any medication and was not on a diet. The patient gave informed, signed consent, and the agreement of the ethics committee of the Institut Fédératif de Recherche (IFR48, Faculty of Medicine, Marseille, France) was obtained under reference 09-022. Four other new bacterial species, Alistipes obesi, Peptoniphilus grossensis, P. obesi and Enorma massiliensis [25-27,33], were also isolated from this specimen using various culture conditions. The fecal specimen was preserved at -80°C after collection. Strain ph2T (Table 1) was isolated in 2011 by anaerobic culture on 5% sheep blood-enriched agar in an anaerobic atmosphere at 37°C, following 26 days in a blood culture bottle with rumen and sheep blood. The 16S rRNA nucleotide sequence (GenBank accession number JN837487) of Kallipyga massiliensis strain ph2T was 86.09% similar to that of Helcococcus sueciensis, the phylogenetically closest species (Figure 1). This value was lower than the 95.0% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers (2006) to delineate a new genus without carrying out DNA-DNA hybridization [5].



Table 1. Classification and general features of Kallipyga massiliensis strain ph2T according to the MIGS recommendations [45].

MIGS ID    Property                  Term                                      Evidence code (a)
           Current classification    Domain Bacteria                           TAS [47]
                                     Phylum Firmicutes                         TAS [48-50]
                                     Class Clostridia                          TAS [51,52]
                                     Order Clostridiales                       TAS [53,54]
                                     Family Clostridiales Incertae Sedis XI    TAS [55]
                                     Genus Kallipyga                           IDA
                                     Species Kallipyga massiliensis            IDA
                                     Type strain ph2T                          IDA
           Gram stain                Positive                                  IDA
           Cell shape                Cocci                                     IDA
           Motility                  Non-motile                                IDA
           Sporulation               Non-sporulating                           IDA
           Temperature range         Mesophile                                 IDA
           Optimum temperature       37°C                                      IDA
MIGS-6.3   Salinity                  Unknown                                   IDA
MIGS-22    Oxygen requirement        Anaerobic                                 IDA
           Carbon source             Unknown                                   NAS
           Energy source             Unknown                                   NAS
MIGS-6     Habitat                   Human gut                                 IDA
MIGS-15    Biotic relationship       Free living                               IDA
           Pathogenicity             Unknown
           Biosafety level           2
MIGS-14    Isolation                 Human feces                               NAS
MIGS-4     Geographic location       France                                    IDA
MIGS-5     Sample collection time    January 2011                              IDA
MIGS-4.1   Latitude                  43.296482                                 IDA
MIGS-4.1   Longitude                 5.36978                                   IDA
MIGS-4.3   Depth                     Surface                                   IDA
MIGS-4.4   Altitude                  0 m above sea level                       IDA

(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [56]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.
[Figure 1 image not reproduced. Taxa shown in the tree: Tissierella praeacuta, Tissierella creatinophila, Tissierella creatinini, Soehngenia saccharolytica, Sporanaerobacter acetigenes, Dethiosulfatibacter aminovorans, Sedimentibacter hydroxybenzoicus, Sedimentibacter saalensis, Peptoniphilus indolicus, Peptoniphilus asaccharolyticus, Parvimonas micra, Finegoldia magna, Gallicola barnesae, Kallipyga massiliensis, Helcococcus sueciensis, Helcococcus ovis, Helcococcus kunzii, Anaerococcus vaginalis, Anaerococcus octavius, Anaerococcus murdochii, Anaerococcus tetradius, Anaerococcus prevotii and Eubacterium cylindroides.]

Figure 1. Phylogenetic tree highlighting the position of Kallipyga massiliensis strain ph2T relative to other type strains within the Clostridiales Incertae Sedis XI family. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the maximum-likelihood method in MEGA software. Numbers at the nodes are percentages of bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. Eubacterium cylindroides was used as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.



By comparison with the GenBank database [57], strain ph2T also exhibited a nucleotide sequence similarity greater than 98.7% with 16 sequences from uncultured bacteria from the human skin microbiome [58]. These bacteria are most likely classified within the same species as strain ph2T.
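
The two 16S rRNA cut-offs quoted in this section (below 95.0% for a new genus, above 98.7% for membership of the same species) can be read as a simple decision rule. The Python sketch below is illustrative only: the function name `classify_16s`, the wording of the returned labels and the handling of values that fall exactly on a threshold are assumptions, and the thresholds are those cited in the text rather than universal constants.

```python
GENUS_THRESHOLD = 95.0    # % identity; genus cut-off quoted from Stackebrandt and Ebers (2006)
SPECIES_THRESHOLD = 98.7  # % identity; value used in the text for the skin microbiome sequences


def classify_16s(similarity_percent: float) -> str:
    """Return a provisional call from a pairwise 16S rRNA identity given in percent."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_percent < GENUS_THRESHOLD:
        return "candidate new genus"
    if similarity_percent <= SPECIES_THRESHOLD:
        # Assumption: a value exactly on the species threshold is not treated as conspecific.
        return "candidate new species"
    return "likely same species"


# Identity between strain ph2T and Helcococcus sueciensis, as reported above.
print(classify_16s(86.09))
```

Such a rule only summarizes the thresholds; as noted in the Introduction, they do not apply uniformly to all genera, so the result is a starting point for a polyphasic description rather than a verdict.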

Different growth temperatures (25, 30, 37, 45°C) were tested; no growth occurred at 25°C and 30°C, growth occurred between 37°C and 45°C, and optimal growth was observed at 37°C. Colonies were bright grey with a diameter of 1.0 mm on 5% blood-enriched Columbia agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMérieux), and in the presence of air, with or without 5% CO2. Optimal growth was obtained anaerobically. No growth was observed under aerobic and microaerophilic conditions. Gram staining showed Gram-positive cocci (Figure 2). A motility test was negative. Cells grown on agar are Gram-positive, have a diameter in electron microscopy ranging from 0.57 µm to 0.78 µm (mean, 0.67 µm; Figure 3) and are mostly grouped in pairs, short chains or small clumps.
Figure 2. Gram staining of K. massiliensis strain ph2T.

Figure 3. Transmission electron microscopy of K. massiliensis strain ph2T using a Morgagni 268D (Philips) at an operating voltage of 60 kV. The scale bar represents 200 nm.

http://standardsingenomics.org

Strain ph2T exhibited neither catalase nor oxidase activity. Using API 32A (BioMérieux), nitrate reduction, indole formation and urease production were negative. Positive reactions were obtained for α-galactosidase, arginine dihydrolase, arginine arylamidase, α-glucosidase and β-glucosidase. Strain ph2T did not ferment mannose or raffinose. Negative reactions were observed for β-galactosidase, β-galactosidase-6-phosphate, α-arabinosidase, β-glucuronidase, N-acetyl-β-glucosaminidase, glutamic acid decarboxylase, proline arylamidase, leucyl glycine arylamidase, phenylalanine arylamidase, pyroglutamic acid arylamidase, tyrosine arylamidase, alanine arylamidase, glycine arylamidase, histidine arylamidase, glutamyl glutamic acid arylamidase, and serine arylamidase. Using API Zym (BioMérieux), positive reactions were observed for esterase lipase, leucine arylamidase, α-glucosidase, β-glucosidase and acid phosphatase. Negative reactions were obtained for esterase, lipase, valine and cysteine arylamidase, trypsin, α-chymotrypsin, naphthol-AS-BI-phosphohydrolase, α-galactosidase, β-galactosidase, β-glucuronidase, N-acetyl-β-glucosaminidase, α-mannosidase and α-fucosidase. Using API 50CH (BioMérieux), K. massiliensis weakly fermented D-ribose, D-glucose, D-fructose and esculin. Compared with its closest phylogenetic neighbors, K. massiliensis differed from Finegoldia magna in α-galactosidase and α-glucosidase production and in D-ribose and D-fructose fermentation. It differed from Helcococcus kunzii in oxygen requirement, in α-galactosidase and leucine arylamidase production, and in D-ribose, D-glucose and esculin utilization. It differed from Parvimonas micra in alkaline phosphatase, glutamyl glutamic acid arylamidase, β-glucosidase, phenylalanine arylamidase and histidine arylamidase production, and in D-glucose fermentation. It differed from Peptoniphilus indolicus in α-galactosidase, indole, α-glucosidase, β-glucosidase, arginine dihydrolase, phenylalanine arylamidase and histidine arylamidase production, and in D-glucose fermentation (Table 2).

K. massiliensis is susceptible to amoxicillin, amoxicillin-clavulanic acid, gentamicin 500, penicillin, imipenem, vancomycin, rifampicin and nitrofurantoin, but resistant to ciprofloxacin, metronidazole, gentamicin 10, trimethoprim/sulfamethoxazole, ceftriaxone, erythromycin and doxycycline.

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [59] using a Microflex spectrometer (Bruker Daltonics, Germany). Twelve distinct deposits were made for strain ph2T from 12 isolated colonies. The twelve ph2T spectra were imported into our database and compared with spectra from 3,769 bacteria using the MALDI BioTyper software (version 2.0, Bruker). A score enabled the presumptive identification and discrimination of the tested species from those in the database: a score > 2 with a validly published species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain ph2T, no significant score was obtained, suggesting that our isolate was not a member of any known species or genus (Figures 4 and 5). A broader study incorporating MALDI-TOF, 16S rDNA and genomic DNA identity data may be conducted to define taxonomic criteria at the family level.
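
The score interpretation described above amounts to a three-way threshold rule. The following Python sketch shows one way to encode it; the function name `interpret_biotyper_score` is hypothetical, it is not part of the MALDI BioTyper software, and the treatment of scores exactly equal to 2 or 1.7 (which the text leaves unspecified) is an assumption.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a spectrum-matching score to the identification level described in the text."""
    if score > 2.0:
        # Applies when the best match is a validly published species.
        return "species-level identification"
    if score > 1.7:
        # Assumption: a score of exactly 2.0 falls into the genus-level band.
        return "genus-level identification"
    # Assumption: a score of exactly 1.7 is treated as no identification.
    return "no identification"


# Hypothetical scores, for illustration only; the study reports no significant score for ph2T.
for example_score in (2.31, 1.85, 1.42):
    print(example_score, "->", interpret_biotyper_score(example_score))
```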

Genome sequencing information 

Genome project history 

The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to members of the family Clostridiales Incertae Sedis XI and is part of a study of the human digestive flora aiming at isolating all bacterial species within human feces [1-3]. It was the thirty-sixth genome from the family Clostridiales Incertae Sedis XI to be sequenced and the first genome of K. massiliensis gen. nov., sp. nov. The genome is deposited in GenBank under accession number CAHC00000000 and consists of 22 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [60].

Growth conditions and DNA isolation 

Kallipyga massiliensis gen. nov., sp. nov., strain ph2T (CSUR=P241, DSM=26229) was grown anaerobically on 5% sheep blood-enriched Columbia agar at 37°C. Three petri dishes were spread and the cultivated bacteria were resuspended in 3 × 100 µL of G2 buffer (EZ1 DNA Tissue kit, Qiagen). A first mechanical lysis was performed with glass powder on the Fastprep-24 device (MP Biomedicals, USA) using 2 × 20-second cycles. DNA was then treated with 2.5 µg/µL lysozyme for 30 minutes at 37°C and extracted using the BioRobot EZ1 Advanced XL (Qiagen). The DNA was then concentrated and purified on a QIAamp kit (Qiagen). The yield and concentration were measured using the Quant-it Picogreen kit (Invitrogen) on the Genios Tecan fluorometer at 78.2 ng/µL.
Table 2. Differential characteristics of Kallipyga massiliensis gen. nov., sp. nov., strain ph2T, Finegoldia magna strain ATCC 29328, Helcococcus kunzii strain ATCC 51366, Parvimonas micra strain ATCC 33270 and Peptoniphilus indolicus strain ATCC 29427T.

Properties                          K. massiliensis  F. magna   H. kunzii              P. micra   P. indolicus
Cell diameter (µm)                  0.67             na         na                     0.3-0.7    na
Oxygen requirement                  anaerobic        anaerobic  facultative anaerobic  anaerobic  anaerobic
Colony color                        bright gray      var                               na         na
Gram stain                          +                +          +                      +          +
Salt requirement                    -                na         +/-                    na         -
Motility                            -                -          -                      na         -
Endospore formation                 -                -          na                     na         -

Production of
Alkaline phosphatase                                 +/-                               +
Catalase                                             +/-                               na         na
Oxidase                                              na         na                     na         na
Nitrate reductase                                                                      na         na
Urease                                                          na                     na
α-galactosidase                     +                -          -                      na         -
β-galactosidase                     -                -          -                      na         -
Indole                                                          na                                +
Arginine arylamidase                +                +          na                     +          +
Glutamyl glutamic acid arylamidase                   na         na                     +          na
Arginine dihydrolase                +                +/-        na                     na
α-glucosidase                       +                           na                     na
β-glucosidase                       +                na         na
β-glucuronidase                                                                        na         na
Phenylalanine arylamidase           -                -          na                     +          +
Esterase lipase                     +                na         na                     na         na
Leucine arylamidase                 +                +          -                      na         +
Cystine arylamidase                                  na         na                     na         na
Histidine arylamidase               -                -/w        na                     +          +

Fermentation of
D-mannose                           -                -          na                     na         -
D-ribose                            w                                                  na         na
D-glucose                           w                -/w
D-fructose                          w                +          na                     na         na
Esculin                             w                na         +                      na         na

Isolated from                       human gut        human      human                  human      mastitis of cattle

na = data not available; var = variable; w = weak
Figure 4. Reference mass spectrum from K. massiliensis strain ph2T. Spectra from 12 individual colonies were compared and a reference spectrum was generated.

[Figure 5 image not reproduced. Spectra shown, by label: Peptoniphilus indolicus DSM 20464; Kallipyga massiliensis strain ph2T; Helcococcus ovis DSM 21504T; Finegoldia magna DSM 20362; Anaerococcus prevotii DSM 20473. Left y-axis: spectrum number; x-axis: m/z.]

Figure 5. Gel view comparing Kallipyga massiliensis sp. nov. strain ph2T to other phylogenetically close species. The gel view displays the raw spectra of loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray-scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units. Displayed species are indicated on the left.
Table 3. Project information

MIGS ID      Property                   Term
MIGS-31      Finishing quality          High-quality draft
MIGS-28      Libraries used             454 GS paired-end 3-kb libraries
MIGS-29      Sequencing platform        454 GS FLX Titanium
MIGS-31.2    Sequencing coverage        51.23×
MIGS-30      Assemblers                 Newbler
MIGS-32      Gene calling method        Prodigal
             GenBank Date of Release    May 30, 2012
             NCBI project ID            CAHC00000000
MIGS-13      Project relevance          Study of the human gut microbiome



Genome sequencing and assembly 

DNA (5 µg) was mechanically fragmented on a Hydroshear device (Digilab, Holliston, MA, USA) with an enrichment size of 3-4 kb. The DNA fragmentation was visualized with the Agilent 2100 BioAnalyzer on a DNA labchip 7500, with an optimal size of 3.179 kb. A 3 kb paired-end library was constructed according to the 454 GS FLX Titanium paired-end protocol (Roche). Circularization and nebulization were performed and generated a pattern with an optimum at 600 bp. After PCR amplification through 17 cycles followed by double size selection, the single-stranded paired-end library was quantified using the Quant-it Ribogreen kit (Invitrogen) on the Genios Tecan fluorometer at 58 pg/µL. The library concentration equivalence was calculated as 1.77E+08 molecules/µL. The library was stored at -20°C until further use.
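
The conversion from a mass concentration to a molecule count can be sketched as follows. This is an illustration under stated assumptions, not the authors' documented calculation: it assumes an average mass of 328.3 g/mol per nucleotide of single-stranded DNA and a mean fragment length of 600 nt (taken from the nebulization optimum reported above); the function name `molecules_per_ul` is hypothetical. Under these assumptions the result is close to the reported 1.77E+08 molecules/µL.

```python
AVOGADRO = 6.022e23        # molecules per mole
NT_MASS_G_PER_MOL = 328.3  # assumed average mass of one ssDNA nucleotide


def molecules_per_ul(conc_pg_per_ul: float, mean_length_nt: float) -> float:
    """Convert a single-stranded DNA concentration (pg/uL) to molecules per uL."""
    grams_per_ul = conc_pg_per_ul * 1e-12
    grams_per_mole = NT_MASS_G_PER_MOL * mean_length_nt
    return grams_per_ul / grams_per_mole * AVOGADRO


# 58 pg/uL is the measured concentration; 600 nt is an assumed mean fragment length.
estimate = molecules_per_ul(58.0, 600.0)
print(f"{estimate:.2e} molecules/uL")
```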

The paired-end library was clonally amplified with 0.5 cpb and 1 cpb in 2 SV-emPCR reactions with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The yields of the emPCR were essentially the same, at 12.3% and 12%, within the range of 5 to 20% recommended by the Roche procedure.

Approximately 790,000 beads were loaded on a 1/4 region of a GS Titanium PicoTiterPlate PTP Kit 70x75 and sequenced with the GS FLX Titanium Sequencing Kit XLR70 (Roche). The run was performed overnight and then analyzed on the cluster through the gsRunBrowser and gsAssembler (Roche). A total of 261,794 passed-filter wells were obtained and generated 90.68 Mb with an average length of 346 bp. The passed-filter sequences were assembled using Newbler with 90% identity and a 40 bp overlap. The final assembly identified 3 scaffolds and 22 large contigs (> 1,500 bp), generating a genome size of 1.77 Mb, which corresponds to a coverage of 51.23× genome equivalent.
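
The coverage figure follows directly from the sequencing yield and the assembled genome size. The short Python sketch below reproduces the arithmetic; the function name `fold_coverage` is an illustrative choice. The product of read count and rounded mean read length gives about 90.58 Mb, slightly below the reported 90.68 Mb, presumably because the mean length is rounded.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Return sequencing depth as total sequenced bases divided by genome size."""
    return total_bases / genome_size


reads = 261_794           # passed-filter wells reported in the text
mean_read_length = 346    # bp, rounded mean reported in the text

approx_yield_mb = reads * mean_read_length / 1e6
depth = fold_coverage(90.68e6, 1.77e6)  # reported yield and genome size

print(f"approximate yield: {approx_yield_mb:.2f} Mb")
print(f"coverage: {depth:.2f}x")
```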

Genome annotation 

Open Reading Frames (ORFs) were predicted using Prodigal [61] with default parameters. However, the predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank [57] and Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAs and rRNAs were predicted using the tRNAScan-SE [62] and RNAmmer [63] tools, respectively. Lipoprotein signal peptides and numbers of transmembrane helices were predicted using SignalP [64] and TMHMM [65], respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. Artemis [66] and DNA Plotter [67] were used for data management and visualization of genomic features, respectively. The Mauve alignment tool (version 2.3.1) was used for multiple genomic sequence alignment [68]. To estimate the mean level of nucleotide sequence similarity at the genome level between K. massiliensis and four other members of the family Clostridiales Incertae Sedis XI (Table 6), orthologous proteins were detected using the Proteinortho software [69] and genomes were compared two by two. For each pair of genomes, we determined the mean percentage of nucleotide sequence identity among orthologous ORFs using BLASTn.
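
The length-dependent E-value cut-offs used for ORFans can be sketched as a small filter. The interpretation encoded here is an assumption: a BLASTP hit is taken to be significant when its E-value is below 1e-03 for alignments longer than 80 amino acids, or below 1e-05 otherwise, and an ORF is called an ORFan when none of its hits is significant. The function names, the `(evalue, alignment_length)` tuple layout and the treatment of an alignment of exactly 80 residues are all hypothetical.

```python
from typing import Iterable, Tuple

LONG_ALIGNMENT_EVALUE = 1e-03   # cut-off for alignments longer than 80 amino acids
SHORT_ALIGNMENT_EVALUE = 1e-05  # stricter cut-off for shorter alignments
LENGTH_LIMIT = 80               # amino acids


def is_significant_hit(evalue: float, alignment_length: int) -> bool:
    """Apply the length-dependent E-value threshold to one BLASTP hit."""
    if alignment_length > LENGTH_LIMIT:
        return evalue < LONG_ALIGNMENT_EVALUE
    # Assumption: an alignment of exactly 80 residues uses the stricter cut-off.
    return evalue < SHORT_ALIGNMENT_EVALUE


def is_orfan(hits: Iterable[Tuple[float, int]]) -> bool:
    """Return True when no hit of an ORF passes the significance filter."""
    return not any(is_significant_hit(evalue, length) for evalue, length in hits)


# Hypothetical hits given as (E-value, alignment length in amino acids).
print(is_orfan([(1e-04, 120)]))  # a long alignment below 1e-03 counts as a significant hit
print(is_orfan([(1e-04, 60)]))   # a short alignment must be below 1e-05 to count
```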
Genome properties

The genome is 1,770,679 bp long (one chromosome, no plasmid) with a G+C content of 51.40% (Figure 6 and Table 4). Of the 1,625 predicted chromosomal genes, 1,575 were protein-coding genes and 50 were RNAs. A total of 1,238 genes (76.18%) were assigned a putative function. Forty-two genes were identified as ORFans (2.66%) and the remaining genes were annotated as hypothetical proteins. The properties and statistics of the genome are summarized in Tables 4 and 5. The distribution of genes into COGs functional categories is presented in Table 5.
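
Several of the summary statistics above are simple proportions of the 1,625 predicted genes or of the genome length. The Python sketch below recomputes a few of them from the counts given in the text and in Table 4; the dictionary keys and the function name `percent_of_total` are illustrative choices, and results are rounded to two decimals, which may differ in the last digit from published values that were truncated.

```python
TOTAL_GENES = 1625
GENOME_SIZE_BP = 1_770_679
GC_FRACTION = 0.5140

gene_counts = {
    "protein-coding": 1575,
    "with function prediction": 1238,
    "assigned to COGs": 1165,
    "with transmembrane helices": 405,
}


def percent_of_total(count: int, total: int = TOTAL_GENES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


percentages = {name: percent_of_total(n) for name, n in gene_counts.items()}
gc_bases = round(GENOME_SIZE_BP * GC_FRACTION)  # number of G or C bases implied by 51.40%

for name, value in percentages.items():
    print(f"{name}: {value}%")
print(f"G+C bases: {gc_bases}")
```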



[Figure 6 image not reproduced. Position labels around the circular map run in steps of 170,000 bp, from 170,000 to 1,700,000.]

Figure 6. Graphical circular map of the chromosome. From the outside in, the outer two circles show the genes on the forward and reverse directions (colored by COG categories). The third circle marks the rRNA operon in red and tRNA genes in green. The fourth circle shows the G+C% content plot. The innermost circle shows GC skew, with purple and olive indicating negative and positive values, respectively.
Table 4. Nucleotide content and gene count levels of the chromosome

Attribute                           Value        % of total (a)
Genome size (bp)                    1,770,679    100
DNA coding region (bp)              1,590,528    89.8
DNA G+C content (bp)                910,129      51.40
Total genes                         1,625        100
RNA genes                           50           3.07
Protein-coding genes                1,575        96.92
Genes with function prediction      1,238        76.18
Genes assigned to COGs              1,165        71.69
Genes with peptide signals          90           5.53
Genes with transmembrane helices    405          24.92

(a) The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.



Table 5. Number of genes associated with the 25 general COG functional categories

Code    Value          %age (a)       Description
J       131            4.86           Translation
A       0              0.032          RNA processing and modification
K       75             5.31           Transcription
L       107            5.74           Replication, recombination and repair
B       0              0              Chromatin structure and dynamics
D       15             0.78           Cell cycle control, mitosis and meiosis
Y       0              0              Nuclear structure
V       51             1.53           Defense mechanisms
T       [illegible]    [illegible]    Signal transduction mechanisms
M       72             3.42           Cell wall/membrane biogenesis
N       1              0              Cell motility
Z       0              0              Cytoskeleton
W       0              0              Extracellular structures
U       13             0.84           Intracellular trafficking and secretion
O       45             2.47           Posttranslational modification, protein turnover, chaperones
C       66             4.53           Energy production and conversion
G       88             2.87           Carbohydrate transport and metabolism
E       67             6.16           Amino acid transport and metabolism
F       47             2.05           Nucleotide transport and metabolism
H       34             2.34           Coenzyme transport and metabolism
I       28             4.01           Lipid transport and metabolism
P       59             4.14           Inorganic ion transport and metabolism
Q       4              0.81           Secondary metabolites biosynthesis, transport and catabolism
R       127            8.15           General function prediction only
S       113            5.93           Function unknown
-       410            26.03          Not in COGs

(a) The total is based on the total number of protein-coding genes in the annotated genome.
Genomic comparison of K. massiliensis and other members of the family Clostridiales Incertae Sedis XI

Currently, 35 genomes are available for members of the family Clostridiales Incertae Sedis XI. Here, we compared the genome sequence of K. massiliensis strain ph2T with those of Finegoldia magna strain ATCC 29328, Helcococcus kunzii strain ATCC 51366, Peptoniphilus indolicus strain ATCC 29427 and Parvimonas micra strain ATCC 33270. The draft genome of K. massiliensis (1.77 Mb) is smaller than all other genomes except that of P. micra (1.70 Mb) and exhibits a higher G+C content (51.40%) (Table 6A). The gene content of K. massiliensis is also lower than that of the other four genomes used for comparison (Table 6B). In addition, K. massiliensis shared 653, 549, 592 and 548 orthologous genes with F. magna, H. kunzii, P. indolicus and P. micra, respectively. The average nucleotide sequence identity ranged from 58.61 to 69.17% among Clostridiales Incertae Sedis XI family species, and from 58.61 to 59.97% between K. massiliensis and other species, thus confirming its new genus status (Table 6B).



Table 6A. Genomic comparison of K. massiliensis gen. nov., sp. nov., strain ph2T with four other members of the family Clostridiales Incertae Sedis XI†

Species            Strain        Genome accession number    Genome size (bp)    G+C content (%)
K. massiliensis    ph2T          CAHC00000000               1,770,679           51.40
F. magna           ATCC 29328    NC_010376                  1,797,577           32.1
H. kunzii          ATCC 51366    AGEI01000000               2,083,191           29.40
P. indolicus       ATCC 29427    AGBB01000000               2,101,630           31.70
P. micra           ATCC 33270    ABEE02000000               1,703,772           28.70

† Species and strain names, GenBank genome accession numbers, sizes and G+C contents.




Table 6B. Genomic comparison of K. massiliensis gen. nov., sp. nov., strain ph2T with four other members of the family Clostridiales Incertae Sedis XI†

                   K. massiliensis    F. magna    H. kunzii    P. indolicus    P. micra
K. massiliensis    1,568              635         549          592             548
F. magna           59.22              1,656       629          687             665
H. kunzii          59.06              68.20       1,878        561             560
P. indolicus       59.97              67.98       67.57        2,205           615
P. micra           58.61              69.17       68.52        68.64           1,597

† Numbers of orthologous proteins shared between genomes (upper right), average percentage of nucleotide similarity of orthologous proteins shared between genomes (lower left). Numbers on the diagonal indicate numbers of proteins per genome.
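
The identity ranges quoted in the genomic comparison can be recomputed from the lower-left half of Table 6B. The Python sketch below does so; the dictionary layout and variable names are illustrative choices, and the values are copied from the table.

```python
# Mean nucleotide identity (%) of orthologous genes, lower-left half of Table 6B.
identity = {
    ("F. magna", "K. massiliensis"): 59.22,
    ("H. kunzii", "K. massiliensis"): 59.06,
    ("H. kunzii", "F. magna"): 68.20,
    ("P. indolicus", "K. massiliensis"): 59.97,
    ("P. indolicus", "F. magna"): 67.98,
    ("P. indolicus", "H. kunzii"): 67.57,
    ("P. micra", "K. massiliensis"): 58.61,
    ("P. micra", "F. magna"): 69.17,
    ("P. micra", "H. kunzii"): 68.52,
    ("P. micra", "P. indolicus"): 68.64,
}

all_values = list(identity.values())
kallipyga_values = [v for pair, v in identity.items() if "K. massiliensis" in pair]

family_range = (min(all_values), max(all_values))
kallipyga_range = (min(kallipyga_values), max(kallipyga_values))

print("all pairs:", family_range)
print("pairs involving K. massiliensis:", kallipyga_range)
```

Every pair that involves K. massiliensis falls below 60%, whereas all pairs among the other four genomes lie above 67%, which is the gap the text relies on.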
Conclusion

On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Kallipyga massiliensis gen. nov., sp. nov., which contains strain ph2T. This bacterium has been found in France.

Description of Kallipyga gen. nov.

Kallipyga (cal.li.pi'ga. N.L. fem. n. Kallipyga, from the Greek epithet kallipygos, said of a statue of Aphrodite having beautifully proportioned buttocks).

Gram-positive cocci. Strictly anaerobic. Mesophilic. Non-motile. Negative for catalase, oxidase, indole production and nitrate reduction. Positive for α-galactosidase, arginine dihydrolase, arginine arylamidase, and α- and β-glucosidase. Habitat: human digestive tract. Type species: Kallipyga massiliensis.

Description of Kallipyga massiliensis gen. nov., sp. nov.

Kallipyga massiliensis (mas.il'ien'sis. L. gen. fem. n. massiliensis, of Massilia, the Latin name of Marseille, where strain ph2T was cultivated). It was isolated from the feces of an obese French patient.

Gram-positive cocci. Strictly anaerobic. Mesophilic. Optimal growth at 37°C. Non-motile and non-sporulating. Colonies are bright grey and 1.0 mm in diameter on blood-enriched Columbia agar. Cells are cocci with a diameter ranging from 0.57 µm to 0.78 µm, with a mean diameter of 0.67 µm.